Results 1 - 20 of 34,642
1.
Genome Biol ; 25(1): 83, 2024 Apr 02.
Article in English | MEDLINE | ID: mdl-38566111

ABSTRACT

BACKGROUND: The rise of large-scale multi-species genome sequencing projects promises to shed new light on how genomes encode gene regulatory instructions. To this end, new algorithms are needed that can leverage conservation to capture regulatory elements while accounting for their evolution. RESULTS: Here, we introduce species-aware DNA language models, which we trained on more than 800 species spanning over 500 million years of evolution. Investigating their ability to predict masked nucleotides from context, we show that DNA language models distinguish transcription factor and RNA-binding protein motifs from background non-coding sequence. Owing to their flexibility, DNA language models capture conserved regulatory elements over much further evolutionary distances than sequence alignment would allow. Remarkably, DNA language models reconstruct motif instances bound in vivo better than unbound ones and account for the evolution of motif sequences and their positional constraints, showing that these models capture functional high-order sequence and evolutionary context. We further show that species-aware training yields improved sequence representations for endogenous and MPRA-based gene expression prediction, as well as motif discovery. CONCLUSIONS: Collectively, these results demonstrate that species-aware DNA language models are a powerful, flexible, and scalable tool to integrate information from large compendia of highly diverged genomes.


Subjects
DNA; Regulatory Sequences, Nucleic Acid; Binding Sites; Sequence Alignment; Algorithms; Conserved Sequence/genetics; Evolution, Molecular
2.
Nature ; 628(8006): 186-194, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38509362

ABSTRACT

Drug-resistant bacteria are emerging as a global threat, despite frequently being less fit than their drug-susceptible ancestors [1-8]. Here we sought to define the mechanisms that drive or buffer the fitness cost of rifampicin resistance (RifR) in the bacterial pathogen Mycobacterium tuberculosis (Mtb). Rifampicin inhibits RNA polymerase (RNAP) and is a cornerstone of modern short-course tuberculosis therapy [9,10]. However, RifR Mtb accounts for one-quarter of all deaths due to drug-resistant bacteria [11,12]. We took a comparative functional genomics approach to define processes that are differentially vulnerable to CRISPR interference (CRISPRi) inhibition in RifR Mtb. Among other hits, we found that the universally conserved transcription factor NusG is crucial for the fitness of RifR Mtb. In contrast to its role in Escherichia coli, Mtb NusG has an essential RNAP pro-pausing function mediated by distinct contacts with RNAP and the DNA [13]. We find this pro-pausing NusG-RNAP interface to be under positive selection in clinical RifR Mtb isolates. Mutations in the NusG-RNAP interface reduce pro-pausing activity and increase fitness of RifR Mtb. Collectively, these results define excessive RNAP pausing as a molecular mechanism that drives the fitness cost of RifR in Mtb, identify a new mechanism of compensation to overcome this cost, suggest rational approaches to exacerbate the fitness cost, and, more broadly, could inform new therapeutic approaches to develop drug combinations to slow the evolution of RifR in Mtb.


Subjects
Bacterial Proteins; Drug Resistance, Bacterial; Evolution, Molecular; Genetic Fitness; Mycobacterium tuberculosis; Rifampin; Humans; Bacterial Proteins/genetics; Bacterial Proteins/metabolism; Conserved Sequence; DNA-Directed RNA Polymerases/antagonists & inhibitors; DNA-Directed RNA Polymerases/genetics; DNA-Directed RNA Polymerases/metabolism; Drug Resistance, Bacterial/drug effects; Drug Resistance, Bacterial/genetics; Escherichia coli/genetics; Escherichia coli/metabolism; Genomics; Mutation; Mycobacterium tuberculosis/drug effects; Mycobacterium tuberculosis/enzymology; Mycobacterium tuberculosis/genetics; Mycobacterium tuberculosis/metabolism; Peptide Elongation Factors/genetics; Peptide Elongation Factors/metabolism; Rifampin/pharmacology; Rifampin/therapeutic use; Transcription Factors/genetics; Transcription Factors/metabolism; Tuberculosis, Multidrug-Resistant/drug therapy; Tuberculosis, Multidrug-Resistant/microbiology
3.
Genome Biol Evol ; 16(4)2024 Apr 02.
Article in English | MEDLINE | ID: mdl-38502060

ABSTRACT

Conserved noncoding elements (CNEs) are DNA sequences located outside of protein-coding genes that can remain under purifying selection for up to hundreds of millions of years. Studies in vertebrate genomes have revealed that most CNEs carry out regulatory functions. Notably, many of them are enhancers that control the expression of homeodomain transcription factors and other genes that play crucial roles in embryonic development. To further our knowledge of CNEs in other parts of the animal tree, we conducted a large-scale characterization of CNEs in more than 50 genomes from three of the main branches of the metazoan tree: Cnidaria, Mollusca, and Arthropoda. We identified hundreds of thousands of CNEs and reconstructed the temporal dynamics of their appearance in each lineage, as well as determining their spatial distribution across genomes. We show that CNEs evolve repeatedly around the same genes across the Metazoa, including around homeodomain genes and other transcription factors; they also evolve repeatedly around genes involved in neural development. We also show that transposons are a major source of CNEs, confirming previous observations from vertebrates and suggesting that they have played a major role in wiring developmental gene regulatory mechanisms since the dawn of animal evolution.


Subjects
Regulatory Sequences, Nucleic Acid; Vertebrates; Animals; Conserved Sequence/genetics; Vertebrates/genetics; Base Sequence; Transcription Factors/genetics; Evolution, Molecular
4.
J Virol ; 98(3): e0182723, 2024 Mar 19.
Article in English | MEDLINE | ID: mdl-38305183

ABSTRACT

Most icosahedral DNA viruses package and condense their genomes into pre-formed, volumetrically constrained capsids. However, concurrent genome biosynthesis and packaging are specific to single-stranded (ss) DNA micro- and parvoviruses. Before packaging, ~120 copies of the φX174 DNA-binding protein J interact with double-stranded DNA. Sixty J proteins enter the procapsid with the ssDNA genome, guiding it between 60 icosahedrally ordered DNA-binding pockets formed by the capsid proteins. Although J proteins are small, 28-37 residues in length, they have two domains. The basic, positively charged N-terminus guides the genome between binding pockets, whereas the C-terminus acts as an anchor to the capsid's inner surface. Three C-terminal aromatic residues, W30, Y31, and F37, interact most extensively with the coat protein. Their corresponding codons were mutated, and the resulting strains were biochemically and genetically characterized. Depending on the mutation, the substitutions produced unstable packaging complexes, unstable virions, infectious progeny, or particles packaged with smaller genomes, the latter being a novel phenomenon. The smaller genomes contained internal deletions. The juncture sequences suggest that the unessential A* (A star) protein mediates deletion formation. IMPORTANCE: Unessential but strongly conserved gene products are understudied, especially when mutations do not confer discernible phenotypes or the protein's contribution to fitness is too small to reliably determine in laboratory-based assays. Consequently, their functions and evolutionary impact remain obscure. The data presented herein suggest that microvirus A* proteins, discovered over 40 years ago, may hasten the termination of non-productive packaging events, thus performing a salvage function by liberating the reusable components of the failed packaging complexes, such as DNA templates and replication enzymes.


Subjects
Bacteriophage phi X 174; Capsid Proteins; DNA, Single-Stranded; DNA, Viral; DNA-Binding Proteins; Evolution, Molecular; Viral Genome Packaging; Bacteriophage phi X 174/chemistry; Bacteriophage phi X 174/genetics; Bacteriophage phi X 174/growth & development; Bacteriophage phi X 174/metabolism; Capsid/chemistry; Capsid/metabolism; Capsid Proteins/genetics; Capsid Proteins/metabolism; Conserved Sequence; DNA, Single-Stranded/metabolism; DNA, Viral/metabolism; DNA-Binding Proteins/genetics; DNA-Binding Proteins/metabolism; Genetic Fitness; Mutation; Phenotype; Templates, Genetic; Virion/chemistry; Virion/genetics; Virion/growth & development; Virion/metabolism
5.
J Biol Chem ; 300(3): 105736, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38336297

ABSTRACT

Advances in personalized medicine and protein engineering require accurately predicting outcomes of amino acid substitutions. Many algorithms correctly predict that evolutionarily-conserved positions show "toggle" substitution phenotypes, which is defined when a few substitutions at that position retain function. In contrast, predictions often fail for substitutions at the less-studied "rheostat" positions, which are defined when different amino acid substitutions at a position sample at least half of the possible functional range. This review describes efforts to understand the impact and significance of rheostat positions: (1) They have been observed in globular soluble, integral membrane, and intrinsically disordered proteins; within single proteins, their prevalence can be up to 40%. (2) Substitutions at rheostat positions can have biological consequences and ∼10% of substitutions gain function. (3) Although both rheostat and "neutral" (defined when all substitutions exhibit wild-type function) positions are nonconserved, the two classes have different evolutionary signatures. (4) Some rheostat positions have pleiotropic effects on function, simultaneously modulating multiple parameters (e.g., altering both affinity and allosteric coupling). (5) In structural studies, substitutions at rheostat positions appear to cause only local perturbations; the overall conformations appear unchanged. (6) Measured functional changes show promising correlations with predicted changes in protein dynamics; the emergent properties of predicted, dynamically coupled amino acid networks might explain some of the complex functional outcomes observed when substituting rheostat positions. Overall, rheostat positions provide unique opportunities for using single substitutions to tune protein function. Future studies of these positions will yield important insights into the protein sequence/function relationship.


Subjects
Amino Acid Substitution; Amino Acids; Proteins; Amino Acid Sequence; Amino Acids/genetics; Amino Acids/metabolism; Conserved Sequence; Evolution, Molecular; Intrinsically Disordered Proteins/chemistry; Intrinsically Disordered Proteins/genetics; Intrinsically Disordered Proteins/metabolism; Membrane Proteins/chemistry; Membrane Proteins/genetics; Membrane Proteins/metabolism; Protein Engineering; Proteins/chemistry; Proteins/genetics; Proteins/metabolism; Structure-Activity Relationship; Humans
6.
J Biol Chem ; 300(3): 105740, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38340794

ABSTRACT

Diseases caused by Leishmania and Trypanosoma parasites are a major health problem in tropical countries. Because of their complex life cycle involving both vertebrate and insect hosts, and >1 billion years of evolutionary distance, the cell biology of trypanosomatid parasites exhibits pronounced differences from that of animal cells. For example, the actin cytoskeleton of trypanosomatids is divergent when compared with other eukaryotes. To understand how actin dynamics are regulated in trypanosomatid parasites, we focused on a central actin-binding protein, profilin. The co-crystal structure of Leishmania major actin in complex with L. major profilin revealed that, although the overall folds of actin and profilin are conserved in eukaryotes, Leishmania profilin contains a unique α-helical insertion, which interacts with the target binding cleft of the actin monomer. This insertion is conserved across the Trypanosomatidae family and is similar to the structure of the WASP homology-2 (WH2) domain, a small actin-binding motif found in many other cytoskeletal regulators. The WH2-like motif contributes to actin monomer binding and enhances the actin nucleotide exchange activity of Leishmania profilin. Moreover, Leishmania profilin inhibited formin-catalyzed actin filament assembly in a mechanism that is dependent on the presence of the WH2-like motif. By generating profilin knockout and knockin Leishmania mexicana strains, we show that profilin is important for efficient endocytic sorting in parasites, and that the ability to bind actin monomers and proline-rich proteins, and the presence of a functional WH2-like motif, are important for the in vivo function of Leishmania profilin. Collectively, this study uncovers molecular principles by which profilin regulates actin dynamics in trypanosomatids.


Subjects
Actin Cytoskeleton; Actins; Leishmania major; Parasites; Profilins; Animals; Humans; Actin Cytoskeleton/chemistry; Actin Cytoskeleton/metabolism; Actins/chemistry; Actins/metabolism; Amino Acid Motifs; Binding Sites; Conserved Sequence; Crystallization; Crystallography, X-Ray; Leishmania major/cytology; Leishmania major/metabolism; Parasites/cytology; Parasites/metabolism; Profilins/chemistry; Profilins/metabolism; Protein Binding; Protein Domains
7.
Nucleic Acids Res ; 52(6): 3121-3136, 2024 Apr 12.
Article in English | MEDLINE | ID: mdl-38375870

ABSTRACT

MicroRNAs (miRNAs) are important and ubiquitous regulators of gene expression in both plants and animals. They are thought to have evolved convergently in these lineages and hypothesized to have played a role in the evolution of multicellularity. In line with this hypothesis, miRNAs have so far been described in only a few unicellular eukaryotes. Here, we investigate the presence and evolution of miRNAs in Amoebozoa, focusing on species belonging to Acanthamoeba, Physarum and dictyostelid taxonomic groups, representing a range of unicellular and multicellular lifestyles. miRNAs that adhere to both the stringent plant and animal miRNA criteria were identified in all examined amoebae, expanding the total number of protists harbouring miRNAs from 7 to 15. We found conserved miRNAs between closely related species, but the majority of species feature only unique miRNAs. This shows rapid gain and/or loss of miRNAs in Amoebozoa, further illustrated by a detailed comparison between two evolutionarily closely related dictyostelids. Additionally, loss of miRNAs in the Dictyostelium discoideum drnB mutant did not seem to affect multicellular development, indicating that the presence of miRNAs does not appear to be a strict requirement for the transition from uni- to multicellular life.


Subjects
Amoebozoa; Evolution, Molecular; MicroRNAs; RNA, Protozoan; Amoebozoa/classification; Amoebozoa/genetics; Dictyostelium/genetics; MicroRNAs/genetics; Phylogeny; RNA, Protozoan/genetics; Conserved Sequence/genetics; RNA Interference
8.
Nucleic Acids Res ; 52(3): 1064-1079, 2024 Feb 09.
Article in English | MEDLINE | ID: mdl-38038264

ABSTRACT

mRNA translation is a fundamental process for life. Selection of the translation initiation site (TIS) is crucial, as it establishes the correct open reading frame for mRNA decoding. Studies in vertebrate mRNAs discovered that a purine at -3 and a G at +4 (where the A of the AUG initiator codon is numbered +1) promote TIS recognition. However, the TIS context in other eukaryotes has been poorly analyzed experimentally. We analyzed in vitro the influence of the -3, -2, -1 and +4 positions of the TIS context in rabbit, Drosophila, wheat, and yeast. We observed that -3A conferred the best translational efficiency across these species. However, we found variability at the +4 position for optimal translation. In addition, the Kozak motif that was defined from mammalian cells was only weakly predictive for wheat and essentially non-predictive for yeast. We discovered eight conserved sequences that significantly disfavored translation. Due to the large differences in translational efficiency observed among weak TIS context sequences, we define a novel category that we termed 'barren AUG context sequences (BACS)', which represent sequences disfavoring translation. Analysis of the structures of mRNA-ribosomal complexes provided insights into the function of BACS. The gene ontology of the BACS-containing mRNAs is presented.
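The position numbering used in this abstract can be illustrated with a minimal Python sketch. It assumes only the convention stated above (the A of AUG is +1, and there is no position 0); the function names `tis_context` and `matches_vertebrate_context`, and the example sequence, are hypothetical and do not come from the article.

```python
def tis_context(mrna: str, aug_index: int) -> dict:
    """Return the bases at positions -3, -2, -1 and +4 around an AUG.

    aug_index is the 0-based index of the A of the AUG codon, which is
    position +1 in the convention above (there is no position 0).
    """
    if mrna[aug_index:aug_index + 3] != "AUG":
        raise ValueError("no AUG at the given index")
    if aug_index < 3 or aug_index + 3 >= len(mrna):
        raise ValueError("sequence does not cover positions -3 to +4")
    return {
        "-3": mrna[aug_index - 3],
        "-2": mrna[aug_index - 2],
        "-1": mrna[aug_index - 1],
        "+4": mrna[aug_index + 3],  # +1, +2, +3 are A, U, G
    }


def matches_vertebrate_context(mrna: str, aug_index: int) -> bool:
    """Check the vertebrate rule cited above: purine at -3 and G at +4."""
    ctx = tis_context(mrna, aug_index)
    return ctx["-3"] in "AG" and ctx["+4"] == "G"


# Illustrative sequence only (hypothetical, not taken from the study).
print(tis_context("GCCACCAUGG", 6))
```

The sketch only extracts and checks context bases; it does not model translational efficiency, which the abstract reports as species-dependent.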


Subjects
Codon, Initiator; Conserved Sequence; Protein Biosynthesis; Animals; Rabbits; Codon, Initiator/genetics; Mammals/genetics; Peptide Chain Initiation, Translational; RNA, Messenger/metabolism; Yeasts; Eukaryota/genetics; Eukaryota/metabolism
9.
Dev Growth Differ ; 66(1): 75-88, 2024 Jan.
Article in English | MEDLINE | ID: mdl-37925606

ABSTRACT

Abnormal expression of the transcriptional regulator and hedgehog (Hh) signaling pathway effector Gli3 is known to trigger congenital disease, most frequently affecting the central nervous system (CNS) and the limbs. Accurate delineation of the genomic cis-regulatory landscape controlling Gli3 transcription during embryonic development is critical for the interpretation of noncoding variants associated with congenital defects. Here, we employed a comparative genomic analysis on fish species with a slow rate of molecular evolution to identify seven previously unknown conserved noncoding elements (CNEs) in Gli3 intronic intervals (CNE15-21). Transgenic assays in zebrafish revealed that most of these elements drive activities in Gli3-expressing tissues, predominantly the fins, CNS, and the heart. Intersection of these CNEs with human disease-associated SNPs identified CNE15 as a putative mammalian craniofacial enhancer, with conserved activity in vertebrates and potentially affected by mutation associated with human craniofacial morphology. Finally, comparative functional dissection of an appendage-specific CNE conserved in slowly evolving fish (elephant shark), but not in teleosts (CNE14/hs1586), indicates co-option of limb specificity from other tissues prior to the divergence of amniotes and lobe-finned fish. These results uncover a novel subset of intronic Gli3 enhancers that arose in the common ancestor of gnathostomes and whose sequence components were likely gradually modified in other species during the process of evolutionary diversification.


Subjects
Enhancer Elements, Genetic; Zebrafish; Animals; Humans; Zebrafish/genetics; Zebrafish/metabolism; Enhancer Elements, Genetic/genetics; Hedgehog Proteins/genetics; Hedgehog Proteins/metabolism; Animals, Genetically Modified; Mammals; Evolution, Molecular; Conserved Sequence/genetics
10.
Nature ; 625(7996): 735-742, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38030727

ABSTRACT

Noncoding DNA is central to our understanding of human gene regulation and complex diseases [1,2], and measuring the evolutionary sequence constraint can establish the functional relevance of putative regulatory elements in the human genome [3-9]. Identifying the genomic elements that have become constrained specifically in primates has been hampered by the faster evolution of noncoding DNA compared to protein-coding DNA [10], the relatively short timescales separating primate species [11], and the previously limited availability of whole-genome sequences [12]. Here we construct a whole-genome alignment of 239 species, representing nearly half of all extant species in the primate order. Using this resource, we identified human regulatory elements that are under selective constraint across primates and other mammals at a 5% false discovery rate. We detected 111,318 DNase I hypersensitivity sites and 267,410 transcription factor binding sites that are constrained specifically in primates but not across other placental mammals and validate their cis-regulatory effects on gene expression. These regulatory elements are enriched for human genetic variants that affect gene expression and complex traits and diseases. Our results highlight the important role of recent evolution in regulatory sequence elements differentiating primates, including humans, from other placental mammals.


Subjects
Conserved Sequence; Evolution, Molecular; Genome; Primates; Animals; Female; Humans; Pregnancy; Conserved Sequence/genetics; Deoxyribonuclease I/metabolism; DNA/genetics; DNA/metabolism; Genome/genetics; Mammals/classification; Mammals/genetics; Placenta; Primates/classification; Primates/genetics; Regulatory Sequences, Nucleic Acid/genetics; Reproducibility of Results; Transcription Factors/metabolism; Proteins/genetics; Gene Expression Regulation/genetics
11.
J Biol Chem ; 300(2): 105611, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38159848

ABSTRACT

During growth, bacteria remodel and recycle their peptidoglycan (PG). A key family of PG-degrading enzymes is the lytic transglycosylases, which produce anhydromuropeptides, a modification that caps the PG chains and contributes to bacterial virulence. Previously, it was reported that the polar-growing Gram-negative plant pathogen Agrobacterium tumefaciens lacks anhydromuropeptides. Here, we report the identification of an enzyme, MdaA (MurNAc deacetylase A), which specifically removes the acetyl group from anhydromuropeptide chain termini in A. tumefaciens, resolving this apparent anomaly. A. tumefaciens lacking MdaA accumulates canonical anhydromuropeptides, whereas MdaA was able to deacetylate anhydro-N-acetyl muramic acid in purified sacculi that lack this modification. As for other PG deacetylases, MdaA belongs to the CE4 family of carbohydrate esterases but harbors an unusual Cys residue in its active site. MdaA is conserved in other polar-growing bacteria, suggesting a possible link between PG chain terminus deacetylation and polar growth.


Subjects
Agrobacterium tumefaciens; Bacterial Proteins; Agrobacterium tumefaciens/classification; Agrobacterium tumefaciens/enzymology; Agrobacterium tumefaciens/genetics; Bacterial Proteins/genetics; Bacterial Proteins/metabolism; Cell Wall; Peptidoglycan; Amidohydrolases/genetics; Amidohydrolases/metabolism; Bacteria/classification; Bacteria/genetics; Bacteria/metabolism; Conserved Sequence/genetics; Gene Deletion
12.
Nature ; 624(7991): 390-402, 2023 Dec.
Article in English | MEDLINE | ID: mdl-38092918

ABSTRACT

Divergence of cis-regulatory elements drives species-specific traits [1], but how this manifests in the evolution of the neocortex at the molecular and cellular level remains unclear. Here we investigated the gene regulatory programs in the primary motor cortex of human, macaque, marmoset and mouse using single-cell multiomics assays, generating gene expression, chromatin accessibility, DNA methylome and chromosomal conformation profiles from a total of over 200,000 cells. From these data, we show evidence that divergence of transcription factor expression corresponds to species-specific epigenome landscapes. We find that conserved and divergent gene regulatory features are reflected in the evolution of the three-dimensional genome. Transposable elements contribute to nearly 80% of the human-specific candidate cis-regulatory elements in cortical cells. Through machine learning, we develop sequence-based predictors of candidate cis-regulatory elements in different species and demonstrate that the genomic regulatory syntax is highly preserved from rodents to primates. Finally, we show that epigenetic conservation combined with sequence similarity helps to uncover functional cis-regulatory elements and enhances our ability to interpret genetic variants contributing to neurological disease and traits.


Subjects
Conserved Sequence; Evolution, Molecular; Gene Expression Regulation; Gene Regulatory Networks; Mammals; Neocortex; Animals; Humans; Mice; Callithrix/genetics; Chromatin/genetics; Chromatin/metabolism; Conserved Sequence/genetics; DNA Methylation; DNA Transposable Elements/genetics; Epigenome; Gene Expression Regulation/genetics; Macaca/genetics; Mammals/genetics; Motor Cortex/cytology; Motor Cortex/metabolism; Multiomics; Neocortex/cytology; Neocortex/metabolism; Regulatory Sequences, Nucleic Acid/genetics; Single-Cell Analysis; Transcription Factors/metabolism; Genetic Variation/genetics
13.
Mol Biol Evol ; 40(12)2023 Dec 01.
Article in English | MEDLINE | ID: mdl-38085182

ABSTRACT

DNA that controls gene expression (e.g. enhancers, promoters) has seemed almost never to be conserved between distantly related animals, like vertebrates and arthropods. This is mysterious, because development of such animals is partly organized by homologous genes with similar complex expression patterns, termed "deep homology." Here, we report 25 regulatory DNA segments conserved across bilaterian animals, of which 7 are also conserved in cnidaria (coral and sea anemone). They control developmental genes (e.g. Nr2f, Ptch, Rfx1/3, Sall, Smad6, Sp5, Tbx2/3), including six homeobox genes: Gsx, Hmx, Meis, Msx, Six1/2, and Zfhx3/4. The segments contain perfectly or near-perfectly conserved CCAAT boxes, E-boxes, and other sequences recognized by regulatory proteins. More such DNA conservation will surely be found soon, as more genomes are published and sequence comparison is optimized. This reveals a control system for animal development conserved since the Precambrian.


Subjects
Anthozoa; Genes, Homeobox; Animals; DNA; Transcription Factors/genetics; Anthozoa/genetics; Embryonic Development/genetics; Conserved Sequence/genetics
14.
Sci Rep ; 13(1): 20391, 2023 Nov 21.
Article in English | MEDLINE | ID: mdl-37990104

ABSTRACT

Patched domain-containing 1 (PTCHD1) is a well-established susceptibility gene for autism spectrum disorder (ASD) and intellectual disability (ID). Previous studies have suggested that alterations in the dosage of PTCHD1 may contribute to the etiology of both ASD and ID. However, there has not yet been a thorough investigation regarding mechanisms that regulate PTCHD1 expression. We sought to characterize the Ptchd1 promoter in a mouse neuronal model, as well as to identify and validate cis-regulatory elements. We defined specific regions of the Ptchd1 promoter essential for robust expression in P19-induced neurons. Evolutionarily conserved putative transcription factor binding sites within these regions were subsequently identified. Using a pairwise comparison of chromatin accessibility between mouse forebrain and liver tissues, a candidate regulatory region ~9.1 kbp downstream of the Ptchd1 stop codon was defined. This region harbours two ENCODE-predicted enhancer cis-regulatory elements. Further, using DNase footprint analysis, a putative YY1-binding motif was also identified. Genomic deletion of the entire 8 kbp downstream open chromatin region attenuated Ptchd1 transcription by over 60% in our neuronal model, corroborating its predicted regulatory function. This study provides mechanistic insights related to the expression of PTCHD1, and provides important context to interpret genetic and genomic variation at this locus which may influence neurodevelopment.


Subjects
Autism Spectrum Disorder; Autistic Disorder; Animals; Mice; Autistic Disorder/genetics; Autism Spectrum Disorder/genetics; Membrane Proteins/metabolism; Neurons/metabolism; Conserved Sequence; Enhancer Elements, Genetic; Chromatin/genetics
15.
Cell Syst ; 14(12): 1103-1112.e6, 2023 Dec 20.
Article in English | MEDLINE | ID: mdl-38016465

ABSTRACT

The sequence in the 5' untranslated regions (UTRs) is known to affect mRNA translation rates. However, the underlying regulatory grammar remains elusive. Here, we propose MTtrans, a multi-task translation rate predictor capable of learning common sequence patterns from datasets across various experimental techniques. The core premise is that common motifs are more likely to be genuinely involved in translation control. MTtrans outperforms existing methods in both accuracy and the ability to capture transferable motifs across species, highlighting its strength in identifying evolutionarily conserved sequence motifs. Our independent fluorescence-activated cell sorting coupled with deep sequencing (FACS-seq) experiment validates the impact of most motifs identified by MTtrans. Additionally, we introduce "GRU-rewiring," a technique to interpret the hidden states of the recurrent units. Gated recurrent unit (GRU)-rewiring allows us to identify regulatory element-enriched positions and examine the local effects of 5' UTR mutations. MTtrans is a powerful tool for deciphering the translation regulatory motifs.


Subjects
Regulatory Sequences, Nucleic Acid; 5' Untranslated Regions/genetics; Conserved Sequence
16.
Science ; 381(6664): eadg7492, 2023 Sep 22.
Article in English | MEDLINE | ID: mdl-37733863

ABSTRACT

The vast majority of missense variants observed in the human genome are of unknown clinical significance. We present AlphaMissense, an adaptation of AlphaFold fine-tuned on human and primate variant population frequency databases to predict missense variant pathogenicity. By combining structural context and evolutionary conservation, our model achieves state-of-the-art results across a wide range of genetic and experimental benchmarks, all without explicitly training on such data. The average pathogenicity score of genes is also predictive for their cell essentiality, capable of identifying short essential genes that existing statistical approaches are underpowered to detect. As a resource to the community, we provide a database of predictions for all possible human single amino acid substitutions and classify 89% of missense variants as either likely benign or likely pathogenic.


Subjects
Amino Acid Substitution; Disease; Mutation, Missense; Proteome; Sequence Alignment; Humans; Amino Acid Substitution/genetics; Benchmarking; Conserved Sequence; Databases, Genetic; Disease/genetics; Genome, Human; Protein Conformation; Proteome/genetics; Sequence Alignment/methods; Machine Learning
17.
Cell Rep Methods ; 3(8): 100543, 2023 Aug 28.
Article in English | MEDLINE | ID: mdl-37671027

ABSTRACT

The human pangenome, a new reference sequence, addresses many limitations of the current GRCh38 reference. The first release is based on 94 high-quality haploid assemblies from individuals with diverse backgrounds. We employed a k-mer indexing strategy for comparative analysis across multiple assemblies, including the pangenome reference, GRCh38, and CHM13, a telomere-to-telomere reference assembly. Our k-mer indexing approach enabled us to identify a valuable collection of universally conserved sequences across all assemblies, referred to as "pan-conserved segment tags" (PSTs). By examining intervals between these segments, we discerned highly conserved genomic segments and those with structurally related polymorphisms. We found 60,764 polymorphic intervals with unique geo-ethnic features in the pangenome reference. In this study, we utilized ultra-conserved sequences (PSTs) to forge a link between human pangenome assemblies and reference genomes. This methodology enables the examination of any sequence of interest within the pangenome, using the reference genome as a comparative framework.
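The idea of using k-mers to find sequence tags shared by every assembly can be sketched in a few lines of Python (3.9+). This is a simplified illustration under stated assumptions, not the authors' pipeline: it treats a tag as any k-mer that occurs exactly once in each assembly, ignores reverse complements, and holds everything in memory. The function names and toy sequences are hypothetical.

```python
from collections import Counter


def kmer_counts(seq: str, k: int) -> Counter:
    """Count every overlapping k-mer in a sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def shared_unique_kmers(assemblies: dict[str, str], k: int) -> set[str]:
    """Return k-mers that occur exactly once in every assembly.

    Assumption for illustration: single-copy presence in all assemblies
    stands in for a conserved tag; real genome-scale work would need
    canonical k-mers and a disk-backed or compressed index.
    """
    shared = None
    for seq in assemblies.values():
        unique = {kmer for kmer, n in kmer_counts(seq, k).items() if n == 1}
        shared = unique if shared is None else shared & unique
    return shared or set()


# Toy sequences (hypothetical) standing in for assemblies.
toy = {"asm_a": "ACGTACGGA", "asm_b": "TTACGGATT"}
print(sorted(shared_unique_kmers(toy, 4)))
```

Intervals between consecutive shared tags could then be compared across assemblies, which is the role the abstract assigns to the segments between tags.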


Subjects
Neoplasms, Squamous Cell , Skin Neoplasms , Humans , Conserved Sequence , Haploidy , Polymorphism, Genetic
18.
J Mol Biol ; 435(20): 168259, 2023 Oct 15.
Article in English | MEDLINE | ID: mdl-37660941

ABSTRACT

An important pathogenicity factor of SARS-CoV-2 and related coronaviruses is Non-structural protein 1 (Nsp1), which suppresses host gene expression and stunts antiviral signaling. SARS-CoV-2 Nsp1 binds the ribosome to inhibit translation through mRNA displacement and induces degradation of host mRNAs. Here we show that Nsp1-dependent host shutoff is conserved in diverse coronaviruses, but only Nsp1 from β-Coronaviruses (β-CoV) inhibits translation through ribosome binding. The C-terminal domain (CTD) of all β-CoV Nsp1s confers high-affinity ribosome binding despite low sequence conservation. Modeling of interactions of four Nsp1s with the ribosome identified the few absolutely conserved amino acids that, together with an overall conservation in surface charge, form the β-CoV Nsp1 ribosome-binding domain. Contrary to previous models, the Nsp1 ribosome-binding domain is an inefficient translation inhibitor. Instead, the Nsp1-CTD likely functions by recruiting Nsp1's N-terminal "effector" domain. Finally, we show that a cis-acting viral RNA element has co-evolved to fine-tune SARS-CoV-2 Nsp1 function, but does not provide similar protection against Nsp1 from related viruses. Together, our work provides new insight into the diversity and conservation of ribosome-dependent host-shutoff functions of Nsp1, knowledge that could aid future efforts in pharmacological targeting of Nsp1 from SARS-CoV-2 and related human-pathogenic β-CoVs. Our study also exemplifies how comparing highly divergent Nsp1 variants can help to dissect the different modalities of this multi-functional viral protein.


Subjects
Host-Pathogen Interactions , Protein Biosynthesis , Ribosomes , SARS-CoV-2 , Viral Nonstructural Proteins , Humans , Amino Acids/chemistry , Amino Acids/genetics , Ribosomes/metabolism , RNA, Messenger/genetics , SARS-CoV-2/genetics , SARS-CoV-2/metabolism , Viral Nonstructural Proteins/chemistry , Conserved Sequence
19.
Int J Biol Macromol ; 253(Pt 4): 126980, 2023 Dec 31.
Article in English | MEDLINE | ID: mdl-37729992

ABSTRACT

Site-directed mutagenesis is a valuable strategy for modifying enzymes, but the lack of understanding of conserved residues regulating glycosidase function hinders enzyme design. We analyzed 1662 enzyme sequences to identify conserved amino acids in maltohexaose-forming amylase at both family and subfamily levels. Several conserved residues at the family level (G37, P45, R52, Y57, D101, V103, H106, G230, R232, D234, E264, H330, D331, and G360) were found, mutations of which resulted in reduced enzyme activity or inactivation. At the subfamily level, several conserved residues (L65, E67, F68, D111, E114, R126, R147, F154, W156, F161, G163, D165, W218H, V342, W345, and F346) were identified, which primarily facilitate substrate binding in the enzyme's active site, as shown by molecular dynamics and kinetic assays. Our findings provide critical insights into conserved residues essential for catalysis and can inform targeted enzyme design in protein engineering.


Subjects
Amino Acids , Glycoside Hydrolases , Glycoside Hydrolases/genetics , Amino Acid Sequence , Mutagenesis, Site-Directed , Catalytic Domain , Substrate Specificity , Catalysis , Conserved Sequence
20.
Arch Virol ; 168(10): 256, 2023 Sep 22.
Article in English | MEDLINE | ID: mdl-37737963

ABSTRACT

Senecavirus A (SVA) can cause a vesicular disease in swine. It is a positive-strand RNA virus belonging to the genus Senecavirus in the family Picornaviridae. Positive-strand RNA viruses possess positive-sense, single-stranded genomes whose untranslated regions (UTRs) have been reported to contain cis-acting RNA elements. In the present study, a total of 100 SVA isolates were comparatively analyzed at the genome level. A highly conserved fragment (HCF) was found to be located in the 3D sequence and to be close to the 3' UTR. The HCF was computationally predicted to form a stem-loop structure. Eight synonymous mutations can individually disrupt the formation of a single base pair within the stem region. We found that SVA itself was able to tolerate each of these mutations alone, as evidenced by the ability to rescue all eight single-site mutants from their individual cDNA clones, and all of them were genetically stable during serial passaging. However, the replication-competent SVA could not be rescued from another cDNA clone containing all eight mutations. The failure to recover SVA might be attributed to disruption of the predicted stem-loop structure, whereas introduction of a wild-type HCF into the cDNA clone with eight mutations still had no effect on virus recovery. These results suggest that the putative stem-loop structure at the 3' end of the 3D sequence is a cis-acting RNA element that is required for SVA growth.


Subjects
Picornaviridae , Animals , Swine , DNA, Complementary , Picornaviridae/genetics , Positive-Strand RNA Viruses , 3' Untranslated Regions/genetics , Conserved Sequence